30 research outputs found

    Capacity of DNA Data Embedding Under Substitution Mutations

    A number of methods have been proposed over the last decade for encoding information using deoxyribonucleic acid (DNA), giving rise to the emerging area of DNA data embedding. Since a DNA sequence is conceptually equivalent to a sequence of quaternary symbols (bases), DNA data embedding (diversely called DNA watermarking or DNA steganography) can be seen as a digital communications problem where channel errors are tantamount to mutations of DNA bases. Depending on the use of coding or noncoding DNA hosts, which, respectively, denote DNA segments that can or cannot be translated into proteins, DNA data embedding is essentially a problem of communications with or without side information at the encoder. In this paper the Shannon capacity of DNA data embedding is obtained for the case in which DNA sequences are subject to substitution mutations modelled using the Kimura model from molecular evolution studies. Inferences are also drawn with respect to the biological implications of some of the results presented.
    Comment: 22 pages, 13 figures; preliminary versions of this work were presented at the SPIE Media Forensics and Security XII conference (January 2010) and at the IEEE ICASSP conference (March 2010).
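
As a rough illustration of the noncoding case (no side information at the encoder), the sketch below computes the capacity of a single use of a symmetric quaternary substitution channel. The parameterization is an assumption made for this example, not taken from the paper: `p_ti` is the probability of the transition mutation and `p_tv` the probability of each of the two transversion mutations. Because every row of such a matrix is a rearrangement of the same four probabilities, the capacity is 2 bits minus the entropy of one row.

```python
from math import log2


def row_entropy(probs):
    """Shannon entropy in bits of one row of the substitution matrix."""
    return -sum(p * log2(p) for p in probs if p > 0)


def substitution_capacity(p_ti, p_tv):
    """Capacity in bits per base of a symmetric quaternary channel.

    Assumed (hypothetical) parameterization of a Kimura-style model:
      p_ti: probability of the transition (A<->G or C<->T)
      p_tv: probability of EACH of the two transversions
    """
    p_same = 1.0 - p_ti - 2.0 * p_tv
    if min(p_same, p_ti, p_tv) < 0:
        raise ValueError("probabilities must be non-negative and sum to 1")
    # Symmetric channel: a uniform input is optimal, so C = log2(4) - H(row).
    return 2.0 - row_entropy([p_same, p_ti, p_tv, p_tv])


if __name__ == "__main__":
    print(substitution_capacity(0.0, 0.0))      # error-free channel
    print(substitution_capacity(0.01, 0.005))   # transitions twice as likely
```

This only covers one application of the mutation channel; it does not model cascaded generations or the coding-DNA case with side information.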

    On the embedding capacity of DNA strands under insertion, deletion and substitution mutations

    Paper presented at Media Forensics and Security XII, SPIE-IS&T Electronic Imaging conference, 18–20 January 2010, San Jose, California.
    A number of methods have been proposed over the last decade for embedding information within deoxyribonucleic acid (DNA). Since a DNA sequence is conceptually equivalent to a unidimensional digital signal, DNA data embedding (diversely called DNA watermarking or DNA steganography) can be seen either as a traditional communications problem or as an instance of communications with side information at the encoder, similar to data hiding. These two cases correspond to the use of noncoding or coding DNA hosts, which, respectively, denote DNA segments that cannot or can be translated into proteins. A limitation of existing DNA data embedding methods is that none of them has been designed according to optimal coding principles. Nor is it possible to evaluate how close to optimality these methods are without determining the Shannon capacity of DNA data embedding. This is the main topic studied in this paper, where we consider that DNA sequences may be subject to substitution, insertion, and deletion mutations.
    Science Foundation Ireland

    Gene Tagging and the Data Hiding Rate

    23rd IET Irish Signals and Systems Conference, Maynooth, Ireland, 28-29 June 2012.
    We analyze the maximum number of ways in which one can intrinsically tag a very particular kind of digital asset: a gene, which is just a DNA sequence that encodes a protein. We consider gene tagging under the most relevant biological constraints: protein encoding preservation with and without codon count preservation. We show that our finite and deterministic combinatorial results are asymptotically, as the length of the gene increases, particular cases of the stochastic Gel’fand and Pinsker capacity formula for communications with side information at the encoder, which lies at the foundations of data hiding theory. This is because gene tagging is a particular case of DNA watermarking.
    Science Foundation Ireland
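
The two counting problems named in the abstract can be illustrated with a small sketch. It assumes the standard genetic code and counts (a) all codon sequences that encode the same protein and (b) only those that also keep the number of occurrences of every codon unchanged. The function names are hypothetical, both counts include the original gene itself, and the code is not claimed to reproduce the paper's derivation.

```python
from collections import Counter
from math import factorial, log2

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons for each amino acid (stop treated as a class).
DEGENERACY = Counter(CODON_TABLE.values())


def codons(gene):
    """Split a gene into complete codons, ignoring trailing bases."""
    usable = len(gene) - len(gene) % 3
    return [gene[i:i + 3] for i in range(0, usable, 3)]


def bits_protein_preserving(gene):
    """log2 of the number of codon sequences encoding the same protein."""
    return sum(log2(DEGENERACY[CODON_TABLE[c]]) for c in codons(gene))


def bits_codon_count_preserving(gene):
    """log2 of the number of same-protein sequences with the same codon counts."""
    per_amino = {}
    for c in codons(gene):
        per_amino.setdefault(CODON_TABLE[c], Counter())[c] += 1
    ways = 1
    for counts in per_amino.values():
        # Multinomial coefficient: rearrangements of this amino acid's codons.
        arrangements = factorial(sum(counts.values()))
        for k in counts.values():
            arrangements //= factorial(k)
        ways *= arrangements
    return log2(ways)


if __name__ == "__main__":
    gene = "ATGGCTGCC"  # Met-Ala-Ala, a toy example
    print(bits_protein_preserving(gene), bits_codon_count_preserving(gene))
```

For the toy gene above, the two alanine positions each admit four synonymous codons, while preserving codon counts only allows swapping the two alanine codons already present.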

    Optimum Exact Histogram Specification

    2018 IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, Alberta, Canada (ICASSP-2018), 15-20 April 2018.
    Exact histogram specification (EHS) is a classic image processing problem which generalises histogram equalisation. Over the years, no optimum solution to the EHS problem has been given with respect to any similarity criterion. An analytic and efficient solution to the optimum EHS problem, according to the mean squared error (MSE) criterion, is presented here. The inverse problem is also examined, and closed-form performance analyses are given in both cases.
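
To make the EHS problem concrete, the sketch below uses order matching: the output takes exactly the target values (so its histogram is the target histogram), and the k-th smallest input pixel receives the k-th smallest target value. By the rearrangement inequality this assignment minimizes the squared error among all rearrangements of the target values. Whether it coincides in detail with the solution presented in the paper is an assumption; the function name and the tie-breaking rule (by pixel index) are choices made for this example.

```python
def exact_histogram_specification(pixels, target_values):
    """Rearrange target_values so they follow the ordering of pixels.

    pixels:        flattened input image (sequence of numbers)
    target_values: multiset of values realizing the target histogram,
                   with the same length as pixels
    """
    if len(pixels) != len(target_values):
        raise ValueError("pixels and target_values must have equal length")
    # Indices of the pixels from darkest to brightest (stable on ties).
    order = sorted(range(len(pixels)), key=lambda i: pixels[i])
    output = [None] * len(pixels)
    for index, value in zip(order, sorted(target_values)):
        output[index] = value
    return output


def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    image = [5, 1, 3]
    target = [0, 10, 20]
    result = exact_histogram_specification(image, target)
    print(result, squared_error(image, result))
```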

    On the Shannon capacity of DNA data embedding

    2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 14-19, 2010.
    This paper first gives a brief overview of information embedding in deoxyribonucleic acid (DNA) sequences and its applications. DNA data embedding can be considered as a particular case of communications with or without side information, depending on the use of coding or noncoding DNA sequences, respectively. Although several DNA data embedding methods have been proposed over the last decade, it is still an open question to determine the maximum amount of information that can theoretically be embedded, that is, its Shannon capacity. This is the main question tackled in this paper.
    Science Foundation Ireland

    Optimum Reversible Data Hiding and Permutation Coding

    7th IEEE International Workshop on Information Forensics and Security (WIFS), Rome, Italy, 16-19 November 2015.
    This paper is mainly devoted to investigating the connection between binary reversible data hiding and permutation coding. We start by undertaking an approximate combinatorial analysis of the embedding capacity of reversible watermarking in the binary Hamming case, which asymptotically shows that optimum reversible watermarking must involve not only 'writing on dirty paper', as in any blind data hiding scenario, but also writing on the dirtiest parts of the paper. The asymptotic analysis leads to the information-theoretical result given by Kalker and Willems more than a decade ago. Furthermore, the novel viewpoint of the problem suggests a near-optimum reversible watermarking algorithm for the low embedding distortion regime based on permutation coding. A practical implementation of permutation coding, previously proposed in the context of maximum-rate perfect steganography of memoryless hosts, can be used to implement the algorithm. The paper concludes with a discussion on the evaluation of the general rate-distortion bound for reversible data hiding.
    University College Dublin

    The role of permutation coding in minimum-distortion perfect counterforensics

    39th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2014.
    This paper exploits the connection between minimum-distortion perfect counterforensics and maximum-rate perfect steganography in order to provide the optimum solution to the first of these problems, in the case in which the forensic detector solely uses first-order statistics. The solution relies on Slepian’s Variant I permutation codes, which had previously been shown to implement maximum-rate perfect steganography when the host is memoryless (equivalently, when the steganographic detector only uses first-order statistics). Additionally, we demonstrate a blind counterforensic strategy made possible by permutation decoding, which may also find application in image processing.
    Science Foundation Ireland

    Blind Turbo Decoding of Side-Informed Data Hiding Using Iterative Channel Estimation

    Distortion-Compensated Dither Modulation (DC-DM) has been theoretically shown to be a near-capacity achieving data hiding method, thanks to its use of side information at the encoder. In practice, channel coding is needed to approach its achievable rate limit. However, the most powerful coding methods, such as turbo coding, require knowledge of the channel model. We investigate here the possibility of undertaking blind iterative decoding of DC-DM. To this end, we undertake maximum likelihood estimation of the channel model, intertwining the Expectation-Maximization algorithm within the decoding procedure.

    Asymptotically Optimum Perfect Universal Steganography of Finite Memoryless Sources

    A solution to the problem of asymptotically optimum perfect universal steganography of finite memoryless sources with a passive warden is provided, which is then extended to contemplate a distortion constraint. The solution rests on the fact that Slepian’s Variant I permutation coding implements first-order perfect universal steganography of finite host signals with optimum embedding rate. The duality between perfect universal steganography with asymptotically optimum embedding rate and lossless universal source coding with asymptotically optimum compression rate is evinced in practice by showing that permutation coding can be implemented by means of adaptive arithmetic coding. Next, a distortion constraint between the host signal and the information-carrying signal is considered. Such a constraint is essential whenever real-world host signals with memory (e.g., images, audio, or video) are decorrelated to conform to the memoryless assumption. The constrained version of the problem requires trading off embedding rate and distortion. Partitioned permutation coding is shown to be a practical way to implement this trade-off, performing close to an unattainable upper bound on the rate-distortion function of the problem.
    Science Foundation Ireland
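
The idea that a message can be carried by a rearrangement of the host, leaving its histogram (first-order statistics) untouched, can be sketched as follows. The codewords are the distinct rearrangements of the host, and a message index is mapped to one of them by lexicographic unranking. This unranking is a simple stand-in chosen for the example; the abstract states that the practical implementation uses adaptive arithmetic coding, which is not shown here. All function names are hypothetical.

```python
from collections import Counter
from math import factorial, log2


def count_rearrangements(counts):
    """Multinomial coefficient: distinct orderings of a multiset."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total


def num_codewords(host):
    """Number of distinct rearrangements of the host signal."""
    return count_rearrangements(Counter(host).values())


def unrank_permutation(index, host):
    """Return the index-th rearrangement of host in lexicographic order."""
    counts = Counter(host)
    if not 0 <= index < count_rearrangements(counts.values()):
        raise ValueError("message index out of range")
    output = []
    for _ in range(len(host)):
        for symbol in sorted(counts):
            if counts[symbol] == 0:
                continue
            counts[symbol] -= 1
            # Rearrangements that start with this symbol at this position.
            block = count_rearrangements(counts.values())
            if index < block:
                output.append(symbol)
                break
            index -= block
            counts[symbol] += 1
    return output


if __name__ == "__main__":
    host = [0, 0, 1, 1, 2]
    stego = unrank_permutation(7, host)
    rate = log2(num_codewords(host)) / len(host)  # bits per host sample
    print(stego, Counter(stego) == Counter(host), rate)
```

The output always has the same symbol counts as the host, which is the property that makes the embedding undetectable to a detector that only inspects first-order statistics.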